The State of the Art in Image and Video Retrieval

Authors

  • Nicu Sebe
  • Michael S. Lew
  • Xiang Sean Zhou
  • Thomas S. Huang
  • Erwin M. Bakker
Abstract

Image and video retrieval continues to be one of the most exciting and fastest-growing research areas in the field of multimedia technology. What are the main challenges in image and video retrieval? Despite the sustained efforts of recent years, we think that the paramount challenge remains bridging the semantic gap. By this we mean that low-level features are easily measured and computed, but the starting point of the retrieval process is typically a high-level query from a human. Translating the question posed by a human into the low-level features seen by the computer illustrates the problem in bridging the semantic gap. However, the semantic gap is not merely a matter of translating high-level features to low-level features. The essence of a semantic query is understanding the meaning behind the query. This can involve understanding both the intellectual and emotional sides of the human: not merely the distilled logical portion of the query, but also the personal preferences and emotional undertones of the query and the preferred form of the results. Another important aspect is that digital cameras are becoming widely available. The combined capacity of these devices to generate bits is not easy to express in ordinary numbers. At the same time, the growth in computer speed and disk capacity, and most of all the rapid expansion of the web, will export these bits to wider and wider circles. The immediate question is what to do with all the information. One could store the digital information on tapes, CD-ROMs, DVDs, or any such device, but the level of access would be lower than that of the well-known shoe boxes filled with tapes, old photographs, and letters. What is needed is for the techniques for organizing images and video to keep pace with the amounts of information. Therefore, there is an urgent need for a semantic understanding of image and video. Creating access to still images is still a hard problem.
It requires hard work, precise modeling, the inclusion of considerable amounts of a priori knowledge, and solid experimentation to analyze the contents of a photograph. Fortunately, it can be argued that access to video is in some ways a simpler problem than access to still images. Video comes as a sequence, and what moves together most likely forms an entity in real life, so segmentation of video is intrinsically simpler than

Similar resources

A Novel Approach to Background Subtraction Using Visual Saliency Map

Generally, the human visual system searches for salient regions and movements in video scenes to reduce the search space and effort. Modelling with a visual saliency map provides important information for scene understanding in many applications. In this paper we present a simple method with low computational load that uses a visual saliency map for background subtraction in a video stream. The proposed technique i...

Connected Component Based Word Spotting on Persian Handwritten image documents

Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...

Semiautomatic Image Retrieval Using the High Level Semantic Labels

Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges associated with each of these approaches have led researchers to combine them and to use semi-automatic retrieval, with user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provides two kinds of qu...

Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered the current state-of-the-art models for this task. However, these classifiers have two major drawbacks: the huge computational power demand for...

Performance Evaluation of Medical Image Retrieval Systems Based on a Systematic Review of the Current Literature

Background and Aim: Images, as information vehicles that can convey a large volume of information, are important, especially in the medical field. The variety of image feature attributes and search algorithms in medical image retrieval systems, and the lack of an authority to evaluate the quality of retrieval systems, make a systematic review in medical image retrieval system...

Image retrieval using the combination of text-based and content-based algorithms

Image retrieval is an important research field that has received great attention in recent decades. In this paper, we present an approach to image retrieval based on a combination of text-based and content-based features. Keywords are used as the text-based features, and color and texture as the content-based features. A query in this system contains some keywords and an input...

Publication date: 2003